An experimental comparison of multiple vocoder types
نویسندگان
چکیده
This paper presents an experimental comparison of a broad range of the leading vocoder types which have been previously described. We use a reference implementation of each of these to create stimuli for a listening test using copy synthesis. The listening test is performed using both Lombard and normal read speech stimuli, and with two types of question for comparison. Multi-dimensional Scaling (MDS) is conducted on the listener responses to analyse similarities in terms of quality between the vocoders. Our MDS and clustering results show that the vocoders which use a sinusoidal synthesis approach are perceptually distinguishable from the source-filter vocoders. To help further interpret the axes of the resulting MDS space, we test for correlations with standard acoustic quality metrics and find one axis is strongly correlated with PESQ scores. We also find both speech style and the format of the listening test question may influence test results. Finally, we also present preference test results which compare each vocoder with the natural speech.
منابع مشابه
Speech coding using trajectory compression and multiple sensors
This paper presents a new method of multi-frame speech coding based upon polynomial approximation of speech feature trajectories incorporating multiple sensor signals from microphones, accelerometer, electro-glottograph, and microradar. The trajectory polynomial approximation exploits the inter-frame information redundancy encountered in natural speech. The trajectory method is applicable to fe...
متن کاملStatistical Voice Conversion with WaveNet-Based Waveform Generation
This paper presents a statistical voice conversion (VC) technique with the WaveNet-based waveform generation. VC based on a Gaussian mixture model (GMM) makes it possible to convert the speaker identity of a source speaker into that of a target speaker. However, in the conventional vocoding process, various factors such as F0 extraction errors, parameterization errors and over-smoothing effects...
متن کاملUsing FFI Interpolator and VQ Quantization for Designing of High Quality 1200 BPS Speech Vocoder
Storaging or transmission of speech signals at very low bit rate is a hot area in the field of speech processing. We used stochastic inter-frame interpolators and vector quantization (VQ) as a new method for developing a high quality 1200 BPS speech vocoder. The objective and subjecgtive test results show that performance of the new vocoder is compairable with 4800 BPS standard vocoders (as CELP).
متن کاملHuman Speech Production Based on a Linear Predictive Vocoder – An Interactive Tutorial
This tutorial explains the principle of the human speech production with the aid of a Linear Predictive Vocoder (LPC vocoder) and the use of interactive learning procedures. The components of the human speech organ, namely the excitation and the vocal tract parameters, are computed. The components are then fed into the synthesis part of a vocoder which finally generates a synthesised speech sig...
متن کاملA Prototype Real-time Plugin Framework for the Phase Vocoder
With the dramatic increase in computing power over the last few years, computationally intensive tasks such as the phase vocoder can now be performed faster than real-time. This paper presents details of the modifications and enhancements to the phase vocoder required to support real-time performance, and describes new implementations both as conventional command-line tools and in the form of p...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2013